Abstract 

This paper proposes heuristics that use predicted process resource requirements to make scheduling 
decisions. Four heuristics are presented. The first two, MINQ and SMPL, employ centralized 
scheduling, and the remaining two, DMINQ and FDMINQ, use distributed scheduling. These heuristics 
are first compared against random scheduling and then against two conventional heuristics, CENTEX 
and DISTED, which schedule processes based solely on system state information. Results based on 
trace-driven simulations show that the proposed centralized heuristics offer significantly improved mean 
response times while requiring fewer status update messages. In experiments using the same status 
update rates, SMPL response times were, on average, 22% lower than those for CENTEX, and MINQ 
response times were, on average, 18% lower. The simulations also showed that MINQ and SMPL 
can perform as well as, or better than, CENTEX while using up to 70% fewer status update messages. 

The use of fewer status update messages imposes less overhead on the system. The use of prediction for 
distributed scheduling produced similar results. When prediction was used to filter small processes and 
execute them locally, response times improved by up to 50%. 

Index Terms: Distributed systems, load sharing, statistical clustering, resource prediction, dynamic 
scheduling, and trace-driven simulation. 
1. Introduction 

Dynamic load sharing heuristics have been studied extensively in the past. Most studies 
assume that the resource requirements of processes are not known a priori [Casavant 88]. However, 
better load sharing should be possible if the scheduler uses information on process 
resource requirements to make scheduling decisions. The results in [Devarakonda 89] show that 
it is possible to predict the CPU, memory, and I/O requirements of a process using a statistical 
pattern-recognition technique. This paper is concerned with the use of this prediction methodol- 
ogy to make scheduling decisions and to reduce the overhead of load sharing. 

Four heuristics are proposed which use predicted process resource requirements to 
influence scheduling decisions. The first two (MINQ and SMPL) employ centralized scheduling 
and the remaining two (DMINQ and FDMINQ) use distributed scheduling. MINQ uses the 
predicted process resource requirements to determine the load (as measured by the CPU queue 
length) on the processors. The incoming processes are then directed to the processor with the 
least load. SMPL uses predicted process resource requirements to estimate the response time 
that a process will receive at each processor. The process is then sent to the processor offering 
the lowest estimated response time. DMINQ is simply a distributed version of the MINQ heuris- 
tic. FDMINQ employs a prediction-based filtering mechanism to identify and execute small 
processes locally. 

The proposed prediction-based heuristics are first compared against random scheduling 
and then against two conventional heuristics, CENTEX and DISTED, which schedule 
processes based solely on system state information [Zhou 86]. CENTEX has been shown, 
through trace-driven simulations, to be a very effective centralized heuristic, and DISTED is a 
distributed version of CENTEX. 

Results based on trace-driven simulations show that the proposed centralized heuristics 
offer improved mean response times and perform well while using fewer status update messages. 
The simulations reveal that MINQ and SMPL perform as well as, or better than, CENTEX 
while using up to 70% fewer status update messages. In experiments using an equal 
number of status update messages, SMPL response times were, on average, 22% lower than those 
produced by CENTEX, and MINQ response times were, on average, 18% lower. 

The use of prediction for distributed scheduling produced similar results. Under moderate 
to high loads, the response times for DMINQ were up to 18% lower than those of DISTED. When 
prediction was used to filter small processes and execute them locally (FDMINQ), a 
significant reduction in response times was obtained; the mean response time was up to 50% 
lower than that produced by DISTED. 

The following section discusses recent related work in this area and describes the tech- 
nique used to predict process resource requirements. Section 3 presents the four proposed 
scheduling policies, MINQ, SMPL, DMINQ and FDMINQ. Sections 4 and 5 present the simu- 
lation model and the results of the trace-driven simulations, respectively. Finally, Section 6 
summarizes the important findings and suggests directions for future research. 

2. Background 

The area of dynamic load balancing has been widely investigated ([Eager 86], [Hwang 
82], [Krueger 84], [Leland 86], [Livny 82], [Stankovic 85], [Zhou 86]). Studies that have a bearing 
on the research presented in this paper include [Livny 82], which presents a comprehensive 
study of several load-sharing heuristics and demonstrates that, even with communication delays 
and processing overheads, load-sharing can improve the response time of a system. In [Stankovic 
85], stability issues in load-sharing are discussed and heuristics that use stochastic learning 
automata to reduce instability are described. In [Barak 85], results of an actual implementation 
are described and the use of average rather than instantaneous load measurements is discussed. 
In [Eager 86], an analytical study of three simple load-sharing policies is presented; the 
authors demonstrate that simple policies provide a significant improvement in response time. 
More complex policies are shown to provide only a marginal improvement over these simpler 
policies. 

Research most closely related to that presented in this paper is that of [Zhou 86] and 
[Leland 86]. In [Zhou 86], a centralized, dynamic load-sharing policy, CENTEX, and a distributed 
policy, DISTED, are proposed. CENTEX uses the average CPU queue length to indicate 
the processor load. It directs incoming processes to the processor with the shortest queue 
length. Periodically, each processor sends an update message, consisting of its current CPU 
queue length, to the central scheduler. Using simulations based on real trace data, the author 
shows that CENTEX produces lower overall mean response times than DISTED, a distributed 
version of CENTEX. Zhou also demonstrates that CENTEX performs as well as, or better than, 
the three simple distributed policies described in [Eager 86]. 

Leland and Ott [Leland 86] analyzed 9.5 million Unix processes and found that the residual 
CPU time needed by a process is linearly related to its age. The authors subsequently 
develop a spiral assignment scheme which schedules processes based on their age. Several 
heuristics based on this spiral assignment idea are developed and analyzed via trace-driven 
simulations. 

The use of predicted process resource requirements to influence scheduling has not been 
fully explored in the literature. In systems using a round robin CPU scheduling policy, selection 
of the best processor depends on i) the number of processes in the processor, ii) the resource 
requirements of these processes, and iii) the resource requirement of the process being 
scheduled. The first is easily obtained, and the other two can be estimated by using a prediction 
technique. It can be argued that Leland and Ott implicitly use prediction, since the age of a process 
is used to estimate ("predict") its residual CPU requirement. The heuristics proposed in this study 
use a more direct approach. 

The prediction method described in [Devarakonda 89] uses a statistical pattern-recognition-based 
approach to predict the CPU time, the file I/O and the memory usage of a 
process at the beginning of its life, given the identity of the program being executed (see footnote 1). The 
method was based on an analysis of over 65,000 UNIX processes. Initially, a statistical clustering 
algorithm is used to identify high-density regions of process resource usage. These regions 
(defined as states) are used to build a state-transition model to characterize the resource usage of 
the past executions of a program. The prediction scheme uses the resource usage of a program’s 
last execution and the program’s state-transition model to estimate the resource requirements for 
its next execution. In experiments using this approach, the coefficient of correlation between the 
predicted and actual CPU requirements of the processes analyzed was found to be 0.84. 
All the heuristics proposed in this study use Devarakonda’s prediction scheme. 
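
To make the idea of a per-program state-transition model concrete, the following Python sketch shows one way such a predictor could be organized. It is only an illustration under stated assumptions, not the scheme of [Devarakonda 89]: the class name `ProgramModel`, the use of fixed centroids in place of the clustering step, the nearest-centroid state assignment, and the "most frequent successor" prediction rule are all hypothetical simplifications.

```python
from collections import defaultdict


class ProgramModel:
    """Hypothetical per-program state-transition model (illustrative only)."""

    def __init__(self, centroids):
        # centroids: one (cpu_sec, io_sec) tuple per state. In the paper the
        # states come from statistical clustering; here they are simply given.
        self.centroids = centroids
        # counts[a][b] = number of observed transitions from state a to state b.
        self.counts = defaultdict(lambda: defaultdict(int))
        self.last_state = None

    def _nearest_state(self, usage):
        # Index of the centroid closest to `usage` (squared Euclidean distance).
        return min(
            range(len(self.centroids)),
            key=lambda s: sum((u - c) ** 2 for u, c in zip(usage, self.centroids[s])),
        )

    def observe(self, usage):
        """Record the actual resource usage of a completed execution."""
        state = self._nearest_state(usage)
        if self.last_state is not None:
            self.counts[self.last_state][state] += 1
        self.last_state = state

    def predict(self):
        """Estimate the usage of the next execution from the last observed state."""
        if self.last_state is None:
            return None  # no history for this program yet
        successors = self.counts[self.last_state]
        if not successors:
            # Assumption: with no recorded transition, repeat the last state.
            return self.centroids[self.last_state]
        next_state = max(successors, key=successors.get)
        return self.centroids[next_state]
```

A scheduler would keep one such model per program name, call `predict()` when a process of that program arrives, and call `observe()` when the actual usage figures come back in a status update.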


3.0 The Proposed Load Sharing Heuristics 

This section proposes four dynamic load-sharing heuristics that use predicted process 
resource requirements to make scheduling decisions. The first two (MINQ and SMPL) are centralized 
heuristics, and the remaining two (DMINQ and FDMINQ) are distributed versions of 
MINQ. All four heuristics assume that processes are logically independent of one another and 
are irrevocably assigned to a processor (i.e., once assigned, a process is never migrated to 
another processor). 

1 A process is an execution or instance of a program. 

The framework for the centralized load-sharing heuristics is illustrated in Figure 1. The 
box containing the predictor and the scheduler is referred to as the central scheduler. When a 
process arrives at the central scheduler, its identification is sent to the predictor which predicts 
the resource requirements of the process. These predicted values are then fed to the scheduler 
which, based on the specifics of the load-sharing policy, identifies the processor that will house 
the process. When a process completes execution, its actual resource usage is stored by the pro- 
cessor in a buffer. Periodically, each processor sends a status update message, consisting of the 
contents of this buffer, to the central scheduler which in turn uses this information to update the 
appropriate state transition model. 


Figure 1. Framework of centralized load-sharing with prediction 
Figure 2 depicts the framework for the distributed heuristics. The approach is similar to 
centralized scheduling, except that now each node has its own scheduler. When a process arrives 
at a node, it is first sent to the predictor to estimate its resource requirements. Next, the 
scheduler is invoked to select the processor that will house the process. Periodically, each 
processor sends a status update message to all the other processors. 

Figure 2. Framework of distributed load-sharing with prediction 

3.1 The MINQ Load-Sharing Heuristic 

The MINQ scheduling policy is the simpler of the two centralized heuristics. For each 
processor, i, the scheduler maintains a queue containing the predicted CPU and I/O requirements 
of every process executing in the processor. When a process, X, arrives, the scheduler 
estimates the CPU load on each processor and sends the process to the processor with the 
lowest estimated load. The sequence of steps taken by the scheduler is outlined below. 


1. The estimated average CPU load at each processor i is computed as follows: 

$$CPU\_LOAD_i = \sum_{j=1}^{N_i} \frac{CPU\_REQ_j}{CPU\_REQ_j + IO\_REQ_j}$$

where: 

N_i = the number of processes in processor i 
IO_REQ_j = the predicted I/O requirement of process j, in units of time 
CPU_REQ_j = the predicted CPU requirement of process j, in units of time 

Note that CPU_LOAD_i is an estimate of the CPU queue length. 

2. The process identification is fed to the predictor to obtain the predicted CPU and 
I/O requirements. 

3. The process is sent to processor k with the smallest CPU_LOAD value. 

4. An entry containing the predicted CPU and I/O requirements of process X is 
added to processor k’s queue. 

Recall that at periodic intervals each processor sends a status update message consisting of 
the actual resources used by its processes to the central scheduler. Upon receiving a status 
update message, the central scheduler performs the following processing steps for each process 
in the message. 


1. The process identification and the actual CPU and I/O usage figures are fed to the 
predictor in order to update the state transition diagram for that program. 

2. The appropriate process entry is deleted from the queue of the processor which 
sent the message. 
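
The scheduling and status update steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the class and method names are hypothetical, the per-process load contribution is assumed to be the predicted CPU fraction CPU_REQ / (CPU_REQ + IO_REQ), and the predictor update performed on each status message is omitted.

```python
class MinqScheduler:
    """Illustrative central scheduler following the MINQ steps (names are hypothetical)."""

    def __init__(self, num_processors):
        # One queue per processor: process id -> (predicted CPU, predicted I/O).
        self.queues = [dict() for _ in range(num_processors)]

    def cpu_load(self, i):
        # Assumed load estimate: sum of predicted CPU fractions of queued processes.
        return sum(
            cpu / (cpu + io)
            for cpu, io in self.queues[i].values()
            if cpu + io > 0
        )

    def schedule(self, pid, cpu_req, io_req):
        # Pick the processor with the smallest estimated load (ties go to the
        # lowest index) and record the new process in that processor's queue.
        k = min(range(len(self.queues)), key=self.cpu_load)
        self.queues[k][pid] = (cpu_req, io_req)
        return k

    def status_update(self, i, completed_pids):
        # Delete the entries of processes that processor i reports as completed.
        # Feeding the actual usage back to the predictor is not shown here.
        for pid in completed_pids:
            self.queues[i].pop(pid, None)
```

Between status updates the queues only grow, so the estimated load drifts upward until completed processes are reported; this is the sense in which the update interval affects the quality of the estimate.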


3.2 The SMPL Load-Sharing Heuristic 

When scheduling a process, SMPL takes into account the number of processes in a processor, the CPU 
requirement of these processes and the CPU requirement of the process being scheduled. 
Like the MINQ heuristic, SMPL maintains a queue for each processor. As processes 
arrive and are scheduled, entries are placed in the appropriate queue and, as processes are completed, 
their entries are deleted from the queues. The fundamental difference between MINQ and 
SMPL lies in the policy used to select the processor with the least load. Recall that MINQ simply 
picks the processor with the smallest estimated CPU load. However, when round robin 
CPU scheduling is used, selecting the processor with the smallest CPU load may not necessarily 
guarantee the best response time for the process. When a process, X, is to be scheduled, the 
SMPL approach estimates the response time that the process will receive at each processor and 
selects the processor offering the lowest estimated response time. For each processor, i, the 
scheduler computes the response time in two steps. First, it sums the predicted CPU requirement 
of all processes smaller than process X. Next, for each of the remaining processes, an 
amount equal to the CPU requirement of process X is added to this sum. This final value is the 
estimated response time that X will receive at processor i. The detailed steps performed by 
SMPL are outlined below. 


1. The predicted resource requirements of process X (CPU_REQ_processX, 
IO_REQ_processX) are determined via the prediction mechanism. 

2. Given a round robin CPU scheduling discipline, the estimated response time, r_i, 
that process X will receive at each processor, i, is computed: 

$$r_i = \sum_{j=1}^{N_i} \left[ I_j \times CPU\_REQ_j + (1 - I_j) \times CPU\_REQ_{processX} \right]$$

$$I_j = \begin{cases} 1 & \text{if } CPU\_REQ_j < CPU\_REQ_{processX} \\ 0 & \text{otherwise} \end{cases}$$

where: 

N_i = the number of processes in processor i 
CPU_REQ_j = the amount of CPU required by process j 

3. The processor with the lowest r_i value is selected to house process X. 

4. Process X is added to the appropriate processor’s queue. 
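
The two-step response time estimate described for SMPL can be sketched in Python as follows. The function names are hypothetical and the sketch covers only the selection rule (steps 2 and 3); queue maintenance and prediction are assumed to happen elsewhere.

```python
def smpl_response_time(queue_cpu_reqs, cpu_req_x):
    """Estimated response time for process X at one processor.

    Processes smaller than X contribute their own predicted CPU requirement;
    every other process contributes X's CPU requirement. This is the same as
    summing min(CPU_REQ_j, CPU_REQ_X) over the processor's queue.
    """
    return sum(min(cpu_req_j, cpu_req_x) for cpu_req_j in queue_cpu_reqs)


def smpl_select(queues, cpu_req_x):
    """Return (index of the processor with the lowest estimate, all estimates)."""
    estimates = [smpl_response_time(q, cpu_req_x) for q in queues]
    k = min(range(len(queues)), key=estimates.__getitem__)
    return k, estimates


# Two processors: one holds a single long process, the other three short ones.
queues = [[10.0], [1.0, 1.0, 1.0]]
print(smpl_select(queues, 0.5))   # a small process prefers processor 0
print(smpl_select(queues, 20.0))  # a large process prefers processor 1
```

The example shows why the choice depends on the size of the arriving process: a small process is delayed mainly by the number of competitors under round robin, whereas a large process is delayed by their total demand.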
As in the MINQ heuristic, the central scheduler periodically receives status update messages 
containing the actual I/O and CPU time used by the processes that have been completed. 
The SMPL load-sharing heuristic performs the same status update processing as MINQ. 

3.3 The DMINQ Load-Sharing Heuristic 

In DMINQ (a distributed version of MINQ) each processor has a local scheduler which main- 
tains a separate queue for each processor in the system. Queue i contains the predicted CPU 
and I/O requirements of every process that the local scheduler has sent to processor i. When a 
process arrives at a processor, if the load on the processor is less than a pre-specified threshold 
T, the process is housed in the same processor (see footnote 2). Otherwise, the local scheduler estimates the 
CPU load on every processor in the system via the information in its queues, and assigns the 
process to what it believes to be the least loaded processor. The scheduler then adds the process 
to its queue for the selected processor. 

Periodically, each processor broadcasts a status update message, containing a list of all the 
processes that have been completed since the last status message was broadcast. The local 
schedulers use this message to refresh their global view of the system. 
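
A local DMINQ scheduler along these lines might look like the Python sketch below. It is illustrative only: the names are hypothetical, the local load is assumed to be supplied by the caller as a process count, the remote load estimate reuses the assumed CPU-fraction form CPU_REQ / (CPU_REQ + IO_REQ), and recording locally housed processes in the node's own queue is an assumption the text does not spell out.

```python
class DminqNode:
    """Illustrative local scheduler for one node (names are hypothetical)."""

    def __init__(self, node_id, num_processors, threshold):
        self.node_id = node_id
        self.threshold = threshold
        # sent[i]: process id -> (predicted CPU, predicted I/O) for every
        # process this node has placed on processor i.
        self.sent = [dict() for _ in range(num_processors)]

    def estimated_load(self, i):
        # This node's partial view of the CPU load on processor i.
        return sum(
            cpu / (cpu + io)
            for cpu, io in self.sent[i].values()
            if cpu + io > 0
        )

    def schedule(self, pid, cpu_req, io_req, local_process_count):
        if local_process_count < self.threshold:
            target = self.node_id  # below threshold T: house the process locally
        else:
            target = min(range(len(self.sent)), key=self.estimated_load)
        self.sent[target][pid] = (cpu_req, io_req)
        return target

    def on_broadcast(self, sender, completed_pids):
        # Refresh the local view using a status update broadcast by `sender`.
        for pid in completed_pids:
            self.sent[sender].pop(pid, None)
```

Because each node only knows about the processes it placed itself, its view of remote load is partial; the broadcasts remove completed entries but do not add processes placed by other nodes.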

3.4 The FDMINQ Load-Sharing Heuristic 

FDMINQ is identical to DMINQ except that, instead of using a fixed threshold mechanism 
based on the number of processes in a processor, it uses a filtering mechanism based on prediction. 
Predicted resource requirements are used to identify and filter out small processes (i.e., 
processes requiring little CPU time). Thus, regardless of the load on the processor, all small 
processes are executed locally. As will be seen, this reduces the burden on the scheduler and 
significantly improves the response times for small processes. 

2 A similar threshold mechanism is used by DISTED [Zhou 86]. In addition, DISTED also filters processes based on their 
name. Studies by Zhou show that certain processes are typically large while others are usually small and that process names can be 
used to distinguish between them. Due to a lack of implementation detail in [Zhou 86], this feature was not implemented for the 
DISTED heuristic used in this paper. 
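
The FDMINQ filtering rule can be sketched as a single decision function. This is a hedged illustration: the function name and argument layout are hypothetical, the load estimates are assumed to be computed as in DMINQ, and the default cutoff of 2 seconds of predicted CPU time is the value reported for the experiments in Section 5.2.

```python
def fdminq_target(node_id, predicted_cpu_sec, estimated_loads, small_cutoff_sec=2.0):
    """Choose the processor that will house an arriving process.

    node_id:           index of the processor where the process arrived
    predicted_cpu_sec: predicted CPU requirement of the process
    estimated_loads:   this node's load estimate for every processor
    small_cutoff_sec:  processes predicted to need less CPU time than this
                       are treated as small and always run locally
    """
    if predicted_cpu_sec < small_cutoff_sec:
        return node_id  # filtered: small processes never leave the node
    # Otherwise behave like DMINQ's remote placement: least estimated load.
    return min(range(len(estimated_loads)), key=estimated_loads.__getitem__)


loads = [3.0, 0.5, 1.2]
print(fdminq_target(0, 0.3, loads))  # small process stays on processor 0
print(fdminq_target(0, 8.0, loads))  # large process goes to processor 1
```

Unlike the threshold test in DMINQ, the decision to run locally here does not depend on the current load of the local processor.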

4.0 Experiment Design 

4.1 The System 

The load-sharing heuristics were tested on a simulated distributed system consisting of 
homogeneous processors that are connected by a single communication channel. Each proces- 
sor is assumed to have infinite memory and use a pre-emptive round robin CPU scheduling dis- 
cipline with a 100 millisecond time-slice. Process scheduling and message transfer have prior- 
ity over process execution. A distributed file system is assumed so that the cost of accessing 
files is roughly the same for all hosts. This model is representative of a typical Ethernet-based 
distributed environment (see footnote 3). 

Only the CPU overhead of sending and receiving status update messages by the load-sharing 
heuristic is modeled. The I/O overhead and the message traffic produced by the 
application processes and by the load-sharing heuristics are not modeled. This assumption gives 
us conservative results. As indicated by the measurements in Section 5.1, consideration of the 
impact of message traffic will only further enhance our results. Twenty milliseconds of CPU 
time is assumed to be needed to send a status message and 10 milliseconds to receive 
and process a status update message. A cost of 100 milliseconds of CPU time is incurred by a 
processor when a process is transferred to it. These estimates, for the type of system studied, 
were obtained from [Zhou 86]. Our measurements show that it takes approximately 5 milliseconds 
of CPU time to make a prediction. This cost was also included in the simulations. 

3 It should be emphasized that the proposed heuristics can be adapted to the specifics of a given topology. 
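
As a rough illustration of how these costs scale with the status update interval, the Python sketch below computes the CPU time spent on status messages in a centralized configuration. It is a back-of-the-envelope calculation, not part of the simulator: the function name is hypothetical, and it assumes that each processor sends exactly one status message per interval and that the central scheduler receives all of them.

```python
SEND_MS = 20.0     # CPU time to send one status message (from the text)
RECEIVE_MS = 10.0  # CPU time to receive and process one status message (from the text)


def centralized_update_overhead(num_processors, interval_sec):
    """Return (per_processor, central_scheduler) CPU cost in ms per second."""
    per_processor = SEND_MS / interval_sec
    central = num_processors * RECEIVE_MS / interval_sec
    return per_processor, central


for interval in (3, 12):
    sender, central = centralized_update_overhead(20, interval)
    print(f"{interval:>2} s interval: sender {sender:.2f} ms/s, scheduler {central:.2f} ms/s")
```

Under these assumptions, a 20 node system with a 3 second interval costs the central scheduler about 67 ms of CPU time per second, and a 12 second interval reduces this to about 17 ms per second.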

4.2 Input Trace File 

An actual trace file of 37,000 processes, executed on a VAX 11/780 running 4.3BSD 
Unix, was used as input to the simulated system. Each process in the trace file was defined by 
its identification number, its arrival time, and its CPU, I/O, and memory requirements. Since the 
trace file contains logical I/O performed by a process, a file cache with a hit ratio of 75% was 
modeled. A file cache access time of 0.2 milliseconds and a cache miss time of 70 milliseconds 
were assumed. The process arrival rate was varied to observe the system under various loads. In 
simulating the distributed heuristics, processes were read from the trace file and randomly sent 
to the processors. Fifteen thousand processes were input to the system for each experiment. 
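
The effect of the file cache parameters on the I/O time of a process can be illustrated with a short expected-value calculation. The sketch below is an assumption about how the figures combine (a hit costs 0.2 ms and a miss costs 70 ms in total, averaged rather than sampled per access); the function name is hypothetical and the simulator itself may treat individual accesses differently.

```python
def expected_io_ms(logical_ios, hit_ratio=0.75, hit_ms=0.2, miss_ms=70.0):
    """Expected I/O time in milliseconds for a number of logical I/O operations."""
    per_access = hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms
    return logical_ios * per_access


print(expected_io_ms(1))    # about 17.65 ms per logical I/O
print(expected_io_ms(100))  # about 1765 ms for 100 logical I/Os
```

With these parameters the misses dominate: 0.75 × 0.2 ms + 0.25 × 70 ms = 17.65 ms per logical I/O, of which 17.5 ms comes from cache misses.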

5.0 Experiment Results and Discussion 

Experiments were conducted to investigate system sizes varying from 6 to 25 nodes. 
Since the basic findings were similar, only results 
of experiments conducted on a 20 node system are presented in this paper. The performance 
metric used to judge the load-sharing policies was the mean response time of the processes. All 
the heuristics were simulated using various parameters (e.g., status update interval, threshold 
value) and only the best response times attained are reported. 

The range of loads under which the heuristics were executed is shown in Table 1. 

Table 1. Process Arrival Rates for each Load 

Load    Arrival Rate (processes/sec) 
1       2.8 
2       3.6 
3       4.7 
4       5.7 
5       7.1 
6       8.1 

5.1 Comparison of Centralized Heuristics 

5.1.1 Comparison of MINQ, SMPL and CENTEX 


Typically, random assignment is used as a reference against which new heuristics are compared. 
Figure 3 shows a comparison of MINQ against RND, which randomly assigns a process 
to a processor. It is clear that MINQ yields substantial performance gains over RND at all loads 
tested. 


Figure 3. Comparison of MINQ and a random policy (mean response time in seconds versus increasing load) 
Figure 4 compares the best response times obtained by MINQ and SMPL, which use prediction, 
and CENTEX, which does not. MINQ and SMPL consistently yield better 
response times than CENTEX across the full range of loads. The response times of 
SMPL are as much as 30% lower than those of CENTEX. On average, SMPL response times are 
21% lower than those of CENTEX and MINQ response times are about 18% lower. 

Figure 4. Comparison of MINQ, SMPL & CENTEX (mean response time in seconds versus increasing load) 

MINQ and SMPL perform better than CENTEX because they use predicted process 
resource requirement information to maintain an accurate running estimate of the processor load. 
When a process is to be scheduled, its predicted resource requirements are used to determine the 
load that it will place upon a processor. CENTEX, on the other hand, has no process-specific 
information and hence has no a priori knowledge of the load imposed by a process. In order to 
estimate the load, CENTEX keeps a running account of the CPU queue length at each processor 
and adds a constant (typically 1) to the processor’s queue length each time a process is sent to 
it. 
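
The running account kept by CENTEX, as described above, can be sketched in Python as follows. This is an illustration based only on the description in this paper, not on the implementation in [Zhou 86]; the class and method names are hypothetical.

```python
class CentexLoadTable:
    """Illustrative CENTEX-style load bookkeeping (names are hypothetical)."""

    def __init__(self, num_processors, increment=1.0):
        self.load = [0.0] * num_processors
        self.increment = increment  # constant added per scheduled process

    def on_status_update(self, i, cpu_queue_length):
        # A status update replaces the estimate with the reported queue length.
        self.load[i] = cpu_queue_length

    def schedule(self):
        # Send the process to the processor with the shortest estimated queue,
        # then charge that processor the same constant whatever the process size.
        k = min(range(len(self.load)), key=self.load.__getitem__)
        self.load[k] += self.increment
        return k
```

Every arriving process is charged the same constant, so between status updates a long-running job and a short command look identical to the scheduler; the prediction-based heuristics replace this constant with a per-process estimate.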

Figure 4 also shows that SMPL performs only slightly better than MINQ. The advantage 
of SMPL, as shown in the next subsection, is that its performance is less sensitive to the status 
update interval than that of MINQ. 

5.1.2 Impact of Varying Status Update Intervals 


Figure 5a. MINQ and SMPL using various status update rates (mean response time in seconds; curves labeled 12 sec., 6 sec. and 12 sec.) 
A load-sharing heuristic which requires frequent status updates can significantly increase 
the CPU time needed for message processing. In addition, it can contribute substantially to the 
message traffic in the system. Prediction makes it possible to lower these overheads because it 
allows MINQ and SMPL to maintain their performance while using fewer status update messages. 

Figure 5a compares SMPL and MINQ with respect to status update periods. The figure 
shows the response times for SMPL with a 12 second update interval and compares them with 
those for MINQ with 6 and 12 second update intervals. It is clear that SMPL performs just as well as 
MINQ while using a status update interval that is twice as long. This is because SMPL 
explicitly tries to predict the response time, whereas MINQ simply predicts the load 
at a processor. As seen in Figure 4, for very short intervals, the difference between the two 
heuristics becomes insignificant because the disadvantage of MINQ’s coarse prediction is offset 
by the frequent update of status information. 

Figure 5b. SMPL using a four times slower update rate than CENTEX (mean response time in seconds, 3 to 9, versus load levels 1 to 6) 

Figure 5b compares the response times of SMPL (with a 12 second update interval) and 
CENTEX (with a 3 and a 12 second update interval; see footnote 4). Even with an update interval that is 
four times as large, SMPL achieves response times that are up to 10% lower than those of CENTEX. 
As mentioned in the previous subsection, CENTEX does not sustain its performance 
when using large intervals because it lacks process-specific information. SMPL used 
approximately 70% fewer status update messages than CENTEX executed with a 3 
second interval. Recall that only the CPU overhead of sending and receiving messages was 
modeled. Clearly, if the impact of message traffic were taken into account, the results for SMPL 
would improve further. This may be important since a recent study [Wallace 89] shows that 
an Ethernet channel in a distributed system can have sustained utilization of 50% or higher. In 
such an environment it is advantageous to have a heuristic that can perform well while imposing 
less message traffic on the system. 

4 CENTEX achieves its best response time when using a 3 second update interval. 

5.2 Comparison of the Distributed Heuristics 

Figure 6 compares the response times of DISTED and DMINQ, both of which use a threshold 
mechanism in making scheduling decisions. A threshold T equal to 2 was used in the 
experiment. This value was chosen because it produced the best response times for both heuristics. 
At low loads, both heuristics perform equally well since most of the processes are processed 
locally. However, as the load increases, resulting in more processes being scheduled 
remotely, DMINQ outperforms DISTED by up to 18%. 

Figure 6. Comparison of DMINQ, DISTED (with T=2) & FDMINQ (mean response time in seconds versus increasing load) 

Figure 6 also contains the response time curve for FDMINQ. Recall that FDMINQ uses a 
filtering scheme to identify and execute small processes locally. Here, all processes with a 
predicted CPU time of less than 2 seconds were filtered. FDMINQ’s response time was up to 
51% (average 21%) lower than that of DISTED and up to 33% (average 17%) lower than that of 
DMINQ. Since the only difference between FDMINQ and DMINQ is the filtering mechanism, 
it is clear that explicit filtering of small processes is the reason for the improved response times. 

Figure 7 shows the response time of only those processes that required less than half a 
second of CPU time. As a result of filtering, these processes execute up to three or four times 
faster at high loads than when DMINQ or DISTED is used. In addition, the variance in the 
response times of these processes is also reduced (Figure 8). It is especially important to maintain 
consistently low response times for small processes because even slight increases in their 
response times are noticeable to the user. The filtering scheme affects a large segment of the 
process population; 67% of the processes in the trace file require less than half a second of CPU 
time. 

Figure 7. Response times of the small processes (< 0.5 CPU sec) (mean response time in seconds versus increasing load) 

The threshold mechanism executes processes locally when a processor’s load is below a 
threshold, T. As a result, it becomes ineffective if the processor’s load is consistently higher 
than T. Figure 9 illustrates this behavior. The filtering mechanism, however, always filters the 
same number of processes, regardless of the load. This is important because as the system load 
increases it becomes more efficient to execute processes locally. For this reason FDMINQ performs 
better than DISTED and DMINQ with rising load (see Figures 6 and 7). 

Figure 9. Percent of processes housed locally 

6.0 Conclusion 


This paper has proposed new heuristics for load-sharing which use predicted information 
on process resource usage to make scheduling decisions. Four heuristics were presented. The 
first two, MINQ and SMPL, are centralized heuristics and the remaining two, DMINQ and 
FDMINQ, are distributed heuristics. These heuristics were first compared against random 
scheduling and then against two conventional heuristics, CENTEX and DISTED, which 
schedule processes solely based on system state information. 
Results based on trace-driven simulations showed that the proposed centralized heuristics 
offer improved mean response times and are less dependent on the status update rate (the rate at 
which status information is collected). The simulations revealed that MINQ and SMPL perform 
as well as, or better than, CENTEX while using up to 70% fewer status update messages. The use 
of fewer status update messages imposes less overhead on the system. In experiments where an 
equal number of status update messages were used, SMPL response times were up to 30% 
(average 22%) lower than those produced by CENTEX, and MINQ response times were, on 
average, 18% lower. The use of prediction for distributed scheduling produced similar results. 
Under moderate to high loads, the response times for DMINQ were up to 18% lower than those of 
DISTED. When prediction was used to filter small processes and execute them locally, response 
times improved by up to 50%. 

Directions for further study include using prediction for a class of distributed heuristics that use probing, 
such as the Shortest heuristic proposed in [Eager 86], and investigating the usefulness of prediction 
in real-time scheduling. 